Chemically Ligated RNAs for CRISPR/Cas9-lgRNA Complexes as Antiviral Therapeutic Agents

ABSTRACT

Provided herein are chemically ligated guide RNA oligonucleotides (lgRNA) which comprise two functional RNA modules (crgRNA and tracrgRNA) joined by non-nucleotide chemical linkers (nNt-linker), their complexes with CRISPR-Cas9, and cells comprising lgRNAs, preparation methods of Cas9-lgRNA complexes, and their uses for prevention and treatment of viral infections in humans. Also disclosed are processes and methods for preparation of these compounds.

CROSS REFERENCE TO RELATED APPLICATIONS

The present application is a continuation of U.S. application Ser. No. 15/950,268, filed Apr. 11, 2018, which is a divisional application of U.S. application Ser. No. 15/006,131, filed Jan. 26, 2016 now U.S. Pat. No. 10,059,940, with title “Chemically Ligated RNAs for CRISPR/Cas9-lgRNA Complexes as Antiviral Therapeutic Agents” and naming Minghong Zhong as inventor(s), and claims the benefits of U.S. Provisional Application Ser. No. 62/108,064, the entire said invention being incorporated herein by reference.

TECHNICAL FIELD OF THE INVENTION

The present invention relates to chemically synthesized guide RNA oligonucleotides (lgRNA) comprising two functional RNA modules (crgRNA and tracrgRNA) ligated by non-nucleotide chemical linker(s) (nNt-linker), their complexes with CRISPR-Cas9, preparation methods of Cas9-lgRNA complexes, and uses as medicinal agents in treatment of chronic viral infections.

BACKGROUND OF THE INVENTION

Chronic viral infections such as HIV, HBV, and HSV inflict enormous suffering, loss of life, and financial burden on infected individuals. These infectious diseases are incurable and contagious to varying degrees, and they are prominent threats to public health, highlighting the urgent need for curative therapies. To date, effective antiviral therapies only suppress viral replication but do not clear virus in patients, and do not target the viral genetic materials of latently integrated (proviral DNA) or non-replicating episomal viral genomes (such as cccDNA) in human cells. Nevertheless, these viral DNAs have been reported to directly cause these chronic or latent infections.

Recent breakthroughs in biotechnology have identified several molecular gene editing tools such as zinc finger nucleases (ZFNs), transcription activator-like (TAL) effector nucleases (TALENs), homing endonucleases (HEs), and most notably the clustered regularly interspaced short palindromic repeats (CRISPR)-associated protein Cas9. These nucleases are highly specific DNA-cleaving enzymes, and have recently been shown to have potential therapeutic application in eradicating HBV, HIV, HSV, and herpes virus by targeting the enzymatic systems directly to essential viral genome sequences.

Both ZFNs and TALENs are composed of protein-based programmable, sequence-specific DNA-binding modules and nonspecific DNA cleavage domains, which make multiple-site targeting extremely challenging. This is even more challenging for HEs, which have the same protein domains for both DNA binding and cleavage.

Unlike nucleases such as ZFNs, TALENs, and HEs, CRISPR-Cas-RNA complexes contain short CRISPR RNAs (crRNAs), which direct the Cas proteins to the target nucleic acids via Watson-Crick base pairing to facilitate nucleic acid destruction. Three types (I-III) of CRISPR-Cas systems have been functionally identified across a wide range of microbial species. While the type I and III CRISPR systems utilize ensembles of Cas proteins complexed with crRNAs to mediate the recognition and subsequent degradation of target nucleic acids, the type II CRISPR system recognizes and cleaves the target DNA via the RNA-guided endonuclease Cas9 along with two noncoding RNAs, the crRNA and the trans-activating crRNA (tracrRNA). The crRNA::tracrRNA complex directs Cas9 DNA endonuclease to the protospacer on the target DNA next to the protospacer adjacent motif (PAM) for site-specific cleavage (FIG. 1). This system is further simplified by fusing the crRNA and tracrRNA into a single chimeric guide RNA (sgRNA) with enhanced efficiency, and can be easily reprogrammed to cleave virtually any DNA sequence by redesigning the crRNAs or sgRNAs.

The CRISPR/Cas9 system has been evaluated as a potential therapeutic strategy to cure chronic and/or latent viral infections such as HIV, HBV, and Epstein-Barr virus (EBV). Hu et al. reported that the CRISPR/Cas9 system could eliminate the integrated HIV-1 genome by targeting the HIV-1 LTR U3 region in single and multiplex configurations. It inactivated viral gene expression and replication in latently infected microglial, promonocytic, and T cells, completely excised a 9,709-bp fragment of integrated proviral DNA that spanned from its 5′ to 3′ LTRs, and caused neither genotoxicity nor off-target editing to the host cells. Yang et al. reported that the CRISPR/Cas9 system could significantly reduce the production of HBV core and surface proteins in Huh-7 cells transfected with an HBV-expression vector, and disrupt the HBV expressing templates both in vitro and in vivo, indicating its potential in eradicating persistent HBV infection. They observed that two combinatorial gRNAs targeting different sites could increase the efficiency in causing indels. The study by Seeger and Sohn also reported that CRISPR/Cas9 efficiently inactivated HBV genes in NTCP expressing HepG2 cells permissive for HBV infection. Wang and Quake observed that patient-derived cells from a Burkitt's lymphoma with latent Epstein-Barr virus infection presented dramatic proliferation arrest and a concomitant decrease in viral load after exposure to a CRISPR/Cas9 vector targeted to the viral genome and a mixture of seven guide RNAs at the same molar ratio via plasmid.

In spite of initial success in evaluation of CRISPR/Cas9 for therapeutic applications in treatment of chronic or latent viral infections, off-target cleavage is a major concern for any nuclease therapy. Studies indicated that a 10-12 bp “seed” region located at the 3′-end of the protospacer sequence is critical for its site-specific cleavage, and that Cas9 can tolerate up to seven mismatches at the 5′-end of the 20-base protospacer sequence.
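The positional nature of this mismatch tolerance can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function name `split_mismatches` and the default seed length of 12 are hypothetical choices made here, and both sequences are assumed to be written 5′→3′ in the same alphabet and to have equal length. It counts mismatches separately in the 5′-distal region and in the 3′-end seed region.

```python
def split_mismatches(spacer: str, target: str, seed_len: int = 12) -> tuple[int, int]:
    """Return (distal_mismatches, seed_mismatches) for two equal-length sequences.

    Assumption: the seed is the last `seed_len` positions (the 3'-end),
    and both sequences use the same alphabet (e.g. both written as RNA).
    """
    if len(spacer) != len(target):
        raise ValueError("sequences must have equal length")
    # Index where the 3'-end seed region begins; never negative.
    cut = max(0, len(spacer) - seed_len)
    distal = sum(a != b for a, b in zip(spacer[:cut], target[:cut]))
    seed = sum(a != b for a, b in zip(spacer[cut:], target[cut:]))
    return distal, seed


if __name__ == "__main__":
    guide = "ACGUACGUACGUACGUACGU"
    site = "UCGUACGUACGUACGUACGA"  # differs at the first and last positions
    print(split_mismatches(guide, site))
```

Under the reports summarized above, a candidate off-target site with mismatches only in the distal count would be of greater concern than one with mismatches in the seed count.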

Cas9::sgRNA was reported to be a single turnover enzyme, and was tightly bound by the cleaved DNA products. In thermodynamic terms, the energetic increase in the unwound DNA helix (two DNA single strands) in Cas9 is compensated by the free energy decrease of the hybridization between DNA and crRNA and of the binding of Cas9 with the formed DNA:crRNA helix and with the resulting single strand DNA; to release the cleaved DNA products, especially the DNA strand hybridized with crRNA, the free energy increase in the released DNA requires compensation by a free energy decrease from certain processes such as binding of additional protein factor(s), enabling the recycling of the Cas9::sgRNA enzyme, which is believed to happen under physiological conditions. A more relevant but also more complicated in vitro assay comprising both Cas9::sgRNA and other binding molecules, or a cellular assay, is needed for reliable SAR based on multiple turnover enzymatic reactions.

Another major concern is the requirement of more sophisticated approaches to delivery. To date, the Cas9 protein and RNAs (either dual RNA (crRNA/tracrRNA) or a single guide RNA (sgRNA, ˜100 nt)) have been mostly introduced by plasmid transfection either as a whole or separately, which makes chemical modifications of RNAs extremely challenging, if not impossible, and also is limited by random integration of all or part of the plasmid DNA into the host genome and by persistent and elevated expression of Cas9 in target cells that could lead to off-target effects. A recent study showed that lipid-mediated delivery of unmodified Cas9::sgRNA complexes using common cationic lipid nucleic acid transfection reagents such as Lipofectamine resulted in up to 80% genome modification with substantially higher specificity compared to DNA transfection, and was effective both in vitro and in vivo. Another example showed that treatment with cell penetrating peptide (CPP)-conjugated recombinant Cas9 protein and CPP-complexed guide RNAs led to endogenous gene disruptions in human cell lines with lower off-target mutation frequencies than plasmid transfection.

To minimize or completely eradicate off-target effects, and to overcome other major obstacles to its pharmaceutical applications including the lack of stability of RNA, low potency of Cas9-gRNA at the target sites, and large sizes of Cas9 proteins (˜150 kDa), chemically modified sgRNAs may provide a very effective strategy, as indicated by the therapeutic application of antisense oligonucleotides and progress in small interfering RNA technology, and as supported by a recent report on chemically modified, 29-nucleotide synthetic CRISPR RNA (scrRNA) in combination with unmodified transactivating crRNA (tracrRNA), showing an efficacy comparable to sgRNA, though as dual guide RNAs, this combination should be less effective than sgRNA incorporating the same chemical modifications because of entropic and other factors. However, considering the size of sgRNA (˜100 nt), large-scale chemical manufacture of such large molecules is industrially challenging and costly. Methods for preparation of long RNAs (sgRNA) compatible with extensive chemical modifications and/or significantly truncated sgRNAs or crRNA/tracrRNAs of lengths compatible with current industrial RNA chemical synthesis are needed. Truncating sgRNAs or crRNA/tracrRNAs is limited because of the many essential binding interactions between Cas9 and sgRNA and the complicated molecular mechanisms of recognition and cleavage of target DNAs. A possible solution can be based on recent progress in chemical ligations of nucleic acids. Brown et al. showed that the CuAAC reaction (click chemistry) in conjunction with solid-phase synthesis could produce catalytically active hairpin ribozymes around 100 nucleotides in length.

Recent disclosure of the structure of CRISPR/Cas9-sgRNA complexes indicated that the linkloop (tetraloop) of the sgRNA protrudes outside of the CRISPR/Cas9-sgRNA complex, with the distal 4 base pairs (bp) completely free of interactions with Cas9 peptide side chains (FIG. 2). The natural guide RNAs comprise CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA), directing Cas9 to a specific genomic locus harboring the target sequence, in spite of their lower efficacy than sgRNA. Therefore this tetraloop can be replaced by small molecule non-nucleotide linkers (nNt-linker), and sgRNA can be divided into two pieces of ˜30-32 nt (crgRNA) and ˜60 nt (tracrgRNA) (FIG. 3), respectively, or into three pieces (˜30-32 nt (crgRNA), ˜30 nt (tracrgRNA1), and ˜30 nt (tracrgRNA2)), or multiple pieces based on the void of interactions between certain other sections of sgRNA and Cas9, and joined by chemical ligations in vitro as ligated guide RNAs (lgRNA). These RNAs (crgRNA and tracrgRNA) are more readily synthesized chemically at industrial scale, and can even be further shortened by introductions of chemical modifications at various sites, and therefore more commercially accessible, and lgRNAs can be optimized by chemical modifications for better efficacy and specificity and for targeted delivery.

As presented by previous studies, targeting viral DNA at multiple sites could enhance the effectiveness, which can be better practiced by delivering whole CRISPR/Cas9-lgRNA complexes composed of chemical libraries of different crgRNAs (including various spacers) and formed in vitro. This chemical ligation strategy can provide diverse chemically modified lgRNAs targeting multiple sites and/or variants/mutations of a single site in viral genomes equivalent to combination therapies such as HAART.

SUMMARY OF THE INVENTION

This invention pertains to chemically ligated guide RNA oligonucleotides (lgRNA) comprising two functional RNA modules (crgRNA and tracrgRNA, or crgRNA and dual-tracrgRNA or multiple-tracrgRNA) joined by chemical nNt-linkers, their complexes with CRISPR-Cas9, preparation methods for Cas9-lgRNA complexes and their uses in the treatments of viral infections in humans, represented by the following structure (Formula I):

wherein

(a) spacer (NNNNNNNNNNNNN . . . ) is an oligonucleotide derived from viral genomes corresponding to a viral DNA sequence (protospacer) immediately before NGG or any of other protospacer adjacent motifs (PAM);

(b) spacers (NNNNNNNNNNNNN . . . ) comprise nucleotides modified at sugar moieties such as 2′-deoxyribonucleotides, 2′-methoxyribonucleotides, 2′-F-ribonucleotides, 2′-F-arabinonucleotides, 2′-O,4′-C-methylene nucleotides (LNA), unlocked nucleotides (UNA), nucleoside phosphonoacetate (PACE), thiophosphonoacetate (thioPACE), and phosphoromonothioates:

wherein Q is a nucleic acid base;

(c) A . . . B base pair is selected from A . . . U, U . . . A, C . . . G, and G . . . C;

(d) “nNt-Linker” is a chemical moiety joining the 3′-end of crgRNA and 5′-end of tracrgRNA, and is constructed by click chemistry or chemical ligations; the attaching position at the 3′-end nucleotide of crgRNA can be selected from 3′-, 2′-, and 7- (for 7-deazapurinenucleoside) or 5-(for pyrimidine nucleosides) or 8-(for purine nucleosides) positions, and the attaching position at the 5′-end nucleotide of tracrgRNA can be selected from 5′-, 2′-, and 7-(for 7-deazapurinenucleoside) or 5-(for pyrimidine nucleosides) or 8-(for purine nucleosides) positions, as represented by non-limiting examples below:

(d) Nucleotides in [ ] ([CU] and [GA]) can be deleted or replaced with other nucleotides preferably with base pair(s) unaffected with or without modifications of sugar moieties;

(e) The Cas9-lgRNA complexes can be prepared in the following steps (FIG. 4):

Step 1. Synthesis of crgRNA and tracrgRNA. Structures of non-limiting examples are given below:

Step 2. Chemical ligation of crgRNA and tracrgRNA and annealing to form RNA duplex, lgRNA:

or annealing followed by chemical ligation to form lgRNA.

Step 3. In vitro complex formation between lgRNA and Cas9 protein.

(f) crgRNA in (e) can be single oligonucleotide or a mixture of oligonucleotides of various sequences, and thus the preparation gives a mixture of combinatorial Cas9-lgRNA complexes targeting multiple sites in viral genomes.

In some embodiments, tracrgRNA is a single RNA with chemical modifications or without chemical modifications.

In some embodiments, tracrgRNA is a ligated dual oligonucleotide, comprising tracrgRNA1 and tracrgRNA2, or multiple oligonucleotides.

Provided herein are chemically ligated oligonucleotides for CRISPR-Cas9, their complexes with CRISPR-Cas9, preparation methods and uses of these complexes as medicinal agents in treatment of viral infections. Embodiments of these structures, preparations, and applications are described in details herein.

In certain embodiments, lgRNAs comprise modified nucleotides in crgRNAs, and non-limiting examples include 2′-deoxyribonucleotides, 2′-methoxyribonucleotides, 2′-F-ribonucleotides, 2′-F-arabinonucleotides, 2′-methoxyethoxyribonucleotides, 2′-O,4′-C-methylene nucleotides (LNA), unlocked nucleotide analogues (UNA), nucleoside phosphonoacetate (PACE), thiophosphonoacetate (thioPACE), and phosphoromonothioates.

In certain embodiments, lgRNAs comprise modified nucleotides in tracrgRNAs, and non-limiting examples include 2′-deoxyribonucleotides, 2′-methoxyribonucleotides, 2′-F-ribonucleotides, 2′-F-arabinonucleotides, 2′-methoxyethoxyribonucleotides, 2′-O,4′-C-methylene nucleotides (LNA), unlocked nucleotide analogues (UNA), nucleoside phosphonoacetate (PACE), thiophosphonoacetate (thioPACE), and phosphoromonothioates.

In certain embodiments, lgRNAs comprise nucleotides modified in base moieties of tracrgRNAs, and non-limiting examples include 2,6-diaminopurine, 5-fluorouracil, pseudouracil, and 7-fluoro-7-deazapurine.

In certain embodiments, lgRNAs comprise nucleotides modified in base moieties of crgRNA, and non-limiting examples include 2,6-diaminopurine, 5-fluorouracil, pseudouracil, and 7-fluoro-7-deazapurine.

In certain embodiments, lgRNAs comprise modified nucleotides in both crgRNAs and tracrgRNAs.

In certain embodiments, lgRNAs comprise nucleotides modified in both base moieties and sugar moieties.

In certain embodiments, the nNt-linker comprises a 1,4-disubstituted 1,2,3-triazole formed between an alkyne and an azide via [3+2] cycloaddition catalyzed by Cu(I) or without Cu(I) catalysis (such as strain-promoted azide-alkyne cycloaddition (SPAAC)).

In certain embodiments, the nNt-linker is a thioether by chemical ligation between a thiol and a maleimide, or other functional groups.

In some embodiments, the 3′-end nucleotide of crgRNA is a modified U or C, and the nNt-linker is attached at 5-position of the nucleoside base, or a 7-deazaguanine or 7-deazaadenine, and the nNt-linker is attached at 7-position of the nucleoside base, or a purine nucleotide, and the nNt-linker is attached at the 8-position of the nucleoside base.

In other embodiments, the nNt-linker is attached at the 2′- or 3′-position of the sugar moiety of the 3′-end nucleotide of crgRNA.

In some embodiments, the 5′-end nucleotide of tracrgRNA is a modified U or C, and the nNt-linker is attached at 5-position of the nucleoside base, or a 7-deazaguanine or 7-deazaadenine, and the nNt-linker is attached at 7-position of the nucleoside base, or a purine nucleotide, and the nNt-linker is attached at the 8-position of the nucleoside base.

In other embodiments, the nNt-linker is attached at the 2′- or 5′-position of the sugar moiety of the 5′-end nucleotide of tracrgRNA.

In some embodiments, the 5′-end nucleotide of tracrgRNA is base-paired with the 3′-end nucleotide of crgRNA as A . . . U or G . . . C.

In other embodiments, the 5′-end nucleotide of tracrgRNA and the 3′-end nucleotide of crgRNA are not base-paired.

In certain embodiments, lgRNAs comprise spacers in crgRNA selected from sequences of 12˜20 nt of HIV genomes, of which each thymine is replaced with uracil, immediately before a PAM (such as NGG). The spacer RNA oligomers have the same sequences as the sense strands (5′→3′) of the genomes, and namely are their RNA transcripts with or without further chemical modifications, or have the same sequences as the antisense strands (5′→3′) of the genomes, and namely are their antisense RNA transcripts with or without further chemical modifications.

In certain embodiments, lgRNAs comprise spacers in crgRNA selected from sequences of 12˜20 nt of HBV genomes, of which each thymine is replaced with uracil, immediately before a PAM (such as NGG). The spacer RNA oligomers have the same sequences as the sense strands (5′→3′) of the genomes, and namely are their RNA transcripts with or without further chemical modifications, or have the same sequences as the antisense strands (5′→3′) of the genomes, and namely are their antisense RNA transcripts with or without further chemical modifications.

In certain embodiments, lgRNAs comprise spacers in crgRNA selected from sequences of 12˜20 nt of HSV genomes, of which each thymine is replaced with uracil, immediately before a PAM (such as NGG). The spacer RNA oligomers have the same sequences as the sense strands (5′→3′) of the genomes, and namely are their RNA transcripts with or without further chemical modifications, or have the same sequences as the antisense strands (5′→3′) of the genomes, and namely are their antisense RNA transcripts with or without further chemical modifications.

In certain embodiments, lgRNAs comprise spacers in crgRNA selected from sequences of 12˜20 nt of EBV genomes, of which each thymine is replaced with uracil, immediately before a PAM (such as NGG). The spacer RNA oligomers have the same sequences as the sense strands (5′→3′) of the genomes, and namely are their RNA transcripts with or without further chemical modifications, or have the same sequences as the antisense strands (5′→3′) of the genomes, and namely are their antisense RNA transcripts with or without further chemical modifications.
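The spacer selection rule stated in the embodiments above (a 12˜20 nt stretch immediately before an NGG PAM, with each thymine replaced by uracil, taken from either strand) can be sketched as follows. This is a minimal illustration, not a validated design tool: the function names `candidate_spacers` and `reverse_complement` are hypothetical, only the NGG PAM is handled, and the input is assumed to be a plain DNA string of A, C, G, and T.

```python
import re

# Translation table for Watson-Crick complements of DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand of `dna`, read 5'->3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def candidate_spacers(dna: str, length: int = 20) -> list[tuple[int, str]]:
    """List (start, spacer_rna) for each `length`-nt site directly 5' of an NGG PAM.

    Only the strand given is scanned; pass reverse_complement(dna) to scan
    the other strand. Each T is written as U in the returned spacer.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % length)
    return [(m.start(), m.group(1).replace("T", "U")) for m in pattern.finditer(dna)]


if __name__ == "__main__":
    toy = "ATGCATGCATGCTGG"  # made-up sequence, not a viral genome
    print(candidate_spacers(toy, length=12))
    print(candidate_spacers(reverse_complement(toy), length=12))
```

Scanning the given strand and its reverse complement separately mirrors the sense and antisense alternatives described in the embodiments.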

In certain embodiments, lgRNAs are mixtures comprising lgRNA constructs with different spacers corresponding to different loci of viral genomes and/or variants of a single locus of target genomes. Useful selection methods to identify sequences having extremely low to no homology between the foreign viral genome and host cellular genome including endogenous retroviral DNA include bioinformatics screening to minimize and/or exclude off-target human transcriptome and essential untranslated genomic sites, and to optimize the efficacy of target cleavage.
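One simple form such a homology screen could take is sketched below. It is an assumption-laden illustration rather than the screening method of the invention: the names `min_mismatches` and `screen`, the threshold `min_allowed`, and the toy sequences are all hypothetical, only one strand of the host sequence is scanned, and PAM context, insertions, and deletions are ignored.

```python
def min_mismatches(spacer_dna: str, host: str) -> int:
    """Smallest number of mismatches between the spacer and any same-length host window."""
    n = len(spacer_dna)
    if n == 0 or len(host) < n:
        raise ValueError("host must be at least as long as a non-empty spacer")
    return min(
        sum(a != b for a, b in zip(spacer_dna, host[i:i + n]))
        for i in range(len(host) - n + 1)
    )


def screen(candidates: list[str], host: str, min_allowed: int = 3) -> list[str]:
    """Keep candidates whose closest host window still differs at >= min_allowed positions."""
    return [c for c in candidates if min_mismatches(c, host) >= min_allowed]


if __name__ == "__main__":
    toy_host = "AAAAAAAAAACCCCC"  # made-up stand-in for a host sequence
    print(screen(["AAAAA", "GGGGG"], toy_host))
```

A practical screen would use indexed alignment against the full host genome and transcriptome; the brute-force scan here only makes the selection criterion concrete.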

DETAILED DESCRIPTION OF THE INVENTION

An aspect of the invention is directed to chemically ligated guide RNA oligonucleotides (lgRNA) comprising two functional RNA modules (crgRNA and tracrgRNA, crgRNA and dual-tracrgRNA or multiple-tracrgRNA) joined by chemical nNt-linkers, their complexes with CRISPR-Cas9, preparation methods of Cas9-lgRNA complexes and their uses in the treatments of viral infections in humans.

In some embodiments, this chemical ligation strategy provides diverse chemically modified lgRNAs for optimization for better efficacy in cleaving viral genomic DNAs.

In other embodiments, this chemical ligation strategy provides diverse chemically modified lgRNAs for minimizing off-target cleavages of host genomic DNAs with or without engineering Cas9 proteins.

In some embodiments, this chemical ligation strategy provides diverse chemically modified lgRNAs for decreasing the size of Cas9-lgRNA complex by engineering Cas9 proteins, or for full functional substitution of Cas9 with smaller natural/engineered CRISPR-associated proteins thus more amenable to delivery in human cells, for efficient administrations and better dosage forms.

In other embodiments, Cas9-lgRNA complexes are formulated with transfecting agents such as cationic lipids, cationic polymers and/or cell penetrating peptides for antiviral therapies, either alone or in combination with other direct-acting antiviral agents (DAA).

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1: shows crRNA (Top, SEQ ID NO: 5)::tracrRNA (Bottom, SEQ ID NO: 6) complex in upper diagram, and schematic illustration of a Cas9/gRNA complex bound at its target DNA site at the bottom.

FIG. 2: shows molecular surface representation of the crystal structure of a Cas9/gRNA complex, and the tetraloop is free of any interactions with Cas9.

FIG. 3: shows a molecular model of lgRNA. The ligation nNt-linker (ligation1) is a triazole.

FIG. 4: shows a general procedure to prepare a Cas9-lgRNA complex.

DEFINITIONS

The definitions of terms used herein are consistent with those known to those of ordinary skill in the art, and in case of any differences the definitions are used as specified herein instead.

The term “nucleoside” as used herein refers to a molecule composed of a heterocyclic nitrogenous base, containing an N-glycosidic linkage with a sugar, particularly a pentose. An extended term of “nucleoside” as used herein also refers to acyclic nucleosides and carbocyclic nucleosides.

The term “nucleotide” as used herein refers to a molecule composed of a nucleoside monophosphate, di-, or triphosphate containing a phosphate ester at 5′-, 3′-position or both. The phosphate can also be a phosphonate.

The term of “oligonucleotide” (ON) is herein used interchangeably with “polynucleotide”, “nucleotide sequence”, and “nucleic acid”, and refers to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. An oligonucleotide may comprise one or more modified nucleotides, which may be imparted before or after assembly of such an oligonucleotide. The sequence of nucleotides may be interrupted by non-nucleotide components.

The term of “CRISPR/Cas9” refers to the type II CRISPR-Cas system from Streptococcus pyogenes. The type II CRISPR-Cas system comprises protein Cas9 and two noncoding RNAs (crRNA and tracrRNA). These two noncoding RNAs were further fused into one single guide RNA (sgRNA). The Cas9/sgRNA complex binds double-stranded DNA sequences that contain a sequence match to the first 17-20 nucleotides of the sgRNA and immediately before a protospacer adjacent motif (PAM). Once bound, two independent nuclease domains (HNH and RuvC) in Cas9 each cleave one of the DNA strands 3 bases upstream of the PAM, leaving a blunt end DNA double stranded break (DSB).
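The cut position described in this definition can be made concrete with a small sketch. It assumes a 0-based index `pam_start` for the first base of the PAM on the PAM-containing strand; the function name `blunt_cut` and the toy sequence are hypothetical, and only one strand is modeled because the break is blunt.

```python
def blunt_cut(dna: str, pam_start: int) -> tuple[str, str]:
    """Split `dna` at the blunt break located 3 bases upstream (5') of the PAM.

    `pam_start` is the 0-based index of the first PAM base on this strand.
    Both strands are cut at the same position, so one split point suffices.
    """
    cut = pam_start - 3
    if cut < 0 or pam_start > len(dna):
        raise ValueError("PAM position is out of range for this sequence")
    return dna[:cut], dna[cut:]


if __name__ == "__main__":
    # Made-up 20-nt protospacer followed by a TGG PAM starting at index 20.
    toy = "AAAACCCCGGGGTTTTACGT" + "TGG"
    left, right = blunt_cut(toy, pam_start=20)
    print(left, right)
```

The right-hand fragment begins with the last three protospacer bases followed by the PAM, which is the arrangement expected for a break 3 bases upstream of the PAM.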

The term of “off-target effects” refers to non-targeted cleavage of the genomic DNA target sequence by Cas9 in spite of imperfect matches between the gRNA sequence and the genomic DNA target sequence. Single mismatches of the gRNA can be permissive for off-target cleavage by Cas9. Off-target effects were reported for all the following cases: (a) same length but with 1-5 base mismatches; (b) off-target site in target genomic DNA has one or more bases missing (‘deletions’); (c) off-target site in target genomic DNA has one or more extra bases (‘insertions’).

The term of “guide RNA” (gRNA) refers to a synthetic fusion of crRNA and tracrRNA via a tetraloop (GAAA) (defined as sgRNA) or other chemical linkers such as an nNt-Linker (defined as lgRNA), and is used interchangeably with “chimeric RNA”, “chimeric guide RNA”, “single guide RNA” and “synthetic guide RNA”. The gRNA contains secondary structures of the repeat:anti-repeat duplex, stem loops 1-3, and the linker between stem loops 1 and 2.

The term of “dual RNA” refers to hybridized complex of the short CRISPR RNAs (crRNA) and the trans-activating crRNA (tracrRNA). The crRNA hybridizes with the tracrRNA to form a crRNA:tracrRNA duplex, which is loaded onto Cas9 to direct the cleavage of cognate DNA sequences bearing appropriate protospacer-adjacent motifs (PAM).

The term of “lgRNA” refers to guide RNA (gRNA) joined by chemical ligations to form non-nucleotide linkers (nNt-linkers) between crgRNA and tracrgRNA, or at other sites.

The terms of “dual lgRNA”, “triple lgRNA” and “multiple lgRNA” refer to hybridized complexes of the synthetic guide RNA fused by chemical ligations via non-nucleotide linkers. Dual tracrgRNA is formed by chemical ligation between tracrgRNA1 and tracrgRNA2 (RNA segments of ˜30 nt), and crgRNA (˜30 nt) is fused with a dual tracrgRNA to form a triple lgRNA duplex, which is loaded onto Cas9 to direct the cleavage of cognate DNA sequences bearing appropriate protospacer-adjacent motifs (PAM). Each RNA segment can be readily accessible by chemical manufacturing and compatible to extensive chemical modifications.

The term “guide sequence” refers to the about 20 bp sequence within the guide RNA that specifies the target site and is herein used interchangeably with the terms “guide” or “spacer”. The term “tracr mate sequence” may also be used interchangeably with the term “direct repeat(s)”.

The term of “crgRNA” refers to crRNA equipped with chemical functions for conjugation/ligation and is used interchangeably with crRNA in an lgRNA comprising at least one non-nucleotide linker. The oligonucleotide may be chemically modified close to its 3′-end, at any one or several nucleotides, or for its full sequence.

The term of “tracrgRNA” refers to tracrRNA equipped with chemical functions for conjugation/ligation and is used interchangeably with tracrRNA in an lgRNA comprising at least one non-nucleotide linker. The oligonucleotide may be chemically modified at any one or several nucleotides, or for its full sequence.

The term of “the protospacer adjacent motif (PAM)” refers to a DNA sequence immediately following the DNA sequence targeted by Cas9 in the CRISPR bacterial adaptive immune system, including NGG, NNNNGATT, NNAGAA, NAAAC, and others from different bacterial species where N is any nucleotide.
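Because N stands for any nucleotide, the PAM motifs listed in this definition can be checked against a DNA sequence with simple pattern matching. The sketch below is illustrative only: the names `PAMS`, `pam_regex`, `pam_at`, and `matching_pams` are hypothetical, the motif list is limited to the four examples named above, and the sequence is assumed to contain only A, C, G, and T.

```python
import re

# The four example motifs named in the definition; N is any nucleotide.
PAMS = ("NGG", "NNNNGATT", "NNAGAA", "NAAAC")


def pam_regex(pam: str) -> re.Pattern:
    """Compile a motif into a regex, expanding each N to any DNA base."""
    return re.compile("".join("[ACGT]" if base == "N" else base for base in pam))


def pam_at(dna: str, pos: int, pam: str) -> bool:
    """True if `pam` matches `dna` starting exactly at 0-based index `pos`."""
    return pam_regex(pam).match(dna.upper(), pos) is not None


def matching_pams(dna: str, pos: int) -> list[str]:
    """Return every listed motif that matches at `pos`."""
    return [pam for pam in PAMS if pam_at(dna, pos, pam)]


if __name__ == "__main__":
    print(pam_at("ACGTAGG", 4, "NGG"))
    print(matching_pams("TTAGAAC", 0))
```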

The term of “chemical ligation” refers to joining together synthetic oligonucleotides via an nNt-linker by chemical methods such as click ligation (the azide-alkyne reaction to produce a triazole linkage), thiol-maleimide reaction, and formations of other chemical groups.

The term of “complementary” refers to the ability of a nucleic acid to form hydrogen bond(s) with another nucleic acid sequence by either traditional Watson-Crick or other non-traditional types. Cas9 contains two nuclease domains, HNH and RuvC, which cleave the DNA strands that are complementary and noncomplementary to the 20 nucleotide (nt) guide sequence in crRNAs, respectively.

The term of “Hybridization” refers to a reaction in which one or more polynucleotides form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The complex may comprise two strands forming a duplex structure, three or more strands forming a multi stranded complex, a single self-hybridizing strand, or any combination of these. A sequence capable of hybridizing with a given sequence is referred to as the “complement” of the given sequence.

The synonymous terms “hydroxyl protecting group” and “alcohol-protecting group” as used herein refer to substituents attached to the oxygen of an alcohol group commonly employed to block or protect the alcohol functionality while reacting other functional groups on the compound. Examples of such alcohol-protecting groups include but are not limited to the 2-tetrahydropyranyl group, 2-(bisacetoxyethoxy)methyl group, trityl group, trichloroacetyl group, carbonate-type blocking groups such as benzyloxycarbonyl, trialkylsilyl groups, examples of such being trimethylsilyl, tert-butyldimethylsilyl, tert-butyldiphenylsilyl, phenyldimethylsilyl, triisopropylsilyl and thexyldimethylsilyl, ester groups, examples of such being formyl, (C₁-C₁₀) alkanoyl optionally mono-, di- or tri-substituted with (C₁-C₆) alkyl, (C₁-C₆) alkoxy, halo, aryl, aryloxy or haloaryloxy, the aroyl group including optionally mono-, di- or tri-substituted on the ring carbons with halo, (C₁-C₆) alkyl, (C₁-C₆) alkoxy wherein aryl is phenyl, 2-furyl, carbonates, sulfonates, and ethers such as benzyl, p-methoxybenzyl, methoxymethyl, 2-ethoxyethyl group, etc. The choice of alcohol-protecting group employed is not critical so long as the derivatized alcohol group is stable to the conditions of subsequent reaction(s) on other positions of the compound of the formula and can be removed at the desired point without disrupting the remainder of the molecule. Further examples of groups referred to by the above terms are described by J. W. Barton, “Protective Groups In Organic Chemistry”, J. G. W. McOmie, Ed., Plenum Press, New York, N.Y., 1973, and G. M. Wuts, T. W. Greene, “Protective Groups in Organic Synthesis”, John Wiley & Sons Inc., Hoboken, N.J., 2007, which are hereby incorporated by reference. The related terms “protected hydroxyl” or “protected alcohol” define a hydroxyl group substituted with a hydroxyl protecting group as discussed above.

The term “nitrogen protecting group,” as used herein, refers to groups known in the art that are readily introduced on to and removed from a nitrogen atom. Examples of nitrogen protecting groups include but are not limited to acetyl (Ac), trifluoroacetyl, Boc, Cbz, benzoyl (Bz), N,N-dimethylformamidine (DMF), trityl, and benzyl (Bn). See also G. M. Wuts, T. W. Greene, “Protective Groups in Organic Synthesis”, John Wiley & Sons Inc., Hoboken, N.J., 2007, and related publications.

The term of “Isotopically enriched” refers to a compound containing at least one atom having an isotopic composition other than the natural isotopic composition of that atom. The term of “Isotopic composition” refers to the amount of each isotope present for a given atom, and “natural isotopic composition” refers to the naturally occurring isotopic composition or abundance for a given atom. As used herein, an isotopically enriched compound optionally contains deuterium, carbon-13, nitrogen-15, and/or oxygen-18 at amounts other than their natural isotopic compositions.

As used herein, the terms “therapeutic agent” and “therapeutic agents” refer to any agent(s) which can be used in the treatment or prevention of a disorder or one or more symptoms thereof. In certain embodiments, the term “therapeutic agent” includes a compound provided herein. In certain embodiments, a therapeutic agent is an agent known to be useful for, or which has been or is currently being used for the treatment or prevention of a disorder or one or more symptoms thereof.

Nucleotides

In some embodiments, the crRNA and tracrRNA are truncated at 3′-end and 5′-end, respectively:

and the duplex end is replaced with a small molecule non-nucleotide linker (nNt-linker, ligation1), instead of the tetraloop (GAAA) in sgRNA, to form a ligated dual lgRNA:

wherein “NNNNNNNNNN NNNNNNNNNNNNN” is a guide sequence of 17-20 nt, and N is preferably a ribonucleotide with intact 2′-OH, and wherein “[linker symbol not reproduced]” is a chemical nNt-linker.

In other embodiments, tracrgRNA is a ligated dual oligonucleotide (via ligation2, the inner ligation between tracrgRNA1 and tracrgRNA2), or a multiple oligonucleotide. Non-limiting examples include:

In some embodiments, the crRNA and tracrRNA are further truncated, and non-limiting examples of resulting lgRNAs include:

The two RNA modules (crgRNA and tracrgRNA, crgRNA and dual-tracrgRNA or multiple-tracrgRNA) are synthesized chemically by phosphoramidite chemistry and ligation chemistry either on solid support(s) or in solution. Non-limiting examples of compounds (oligonucleotide azides, oligonucleotide alkynes, oligonucleotide thiols, and oligonucleotide maleimides) and synthetic methods include:

1. Ligation Via Formation of a 1,2,3-Triazole Cross-Linker

a. Direct modification of fully deprotected oligonucleotide amine (such as ON-1) provides RNA alkynes or azides (ON-2):

Non-limiting examples of crosslinking reagents include:

b. Azide and alkyne functions are introduced as phosphoramidites in chemical synthesis of the RNAs (ON-5 and ON-7):

Non-limiting examples of these phosphoramidites include:

c. Preparation of lgRNA by chemical ligation via click chemistry of crgRNA and tracrgRNA and annealing to form RNA duplex, or by annealing followed by chemical ligation via click chemistry.

2. Ligation Via Formation of a Thioether Cross-Linker

a. Thiol function is introduced as a phosphoramidite in chemical synthesis of the RNAs:

Non-limiting examples of these phosphoramidites include:

b. Direct modification of fully deprotected oligonucleotide amine (such as ON-1) provides the RNA maleimide (ON-10):

Non-limiting examples of maleimide crosslinking reagents include:

c. Preparation of lgRNA by chemical ligation via thio-ether formation between crgRNA and tracrgRNA and annealing to form RNA duplex, or by annealing followed by chemical ligation via thio-ether formation.

The above chemical ligations for ligation between crgRNA and tracrgRNA (ligation1) are applicable to formation of ligated dual tracrgRNA between tracrgRNA1 and tracrgRNA2 (ligation2).

In some embodiments, the two ligations (ligation1 and ligation2) are chemically orthogonal. Non-limiting examples include:

a. ligation2 by a triazole linker and ligation1 by a thioether linker;

b. ligation2 by a thioether linker and ligation1 by a triazole linker;

c. ligation2 by some linker and ligation1 by a second linker, which is chemically orthogonal to that for ligation2;

In other embodiments, the two ligations can be formed by the same chemistry such as an azide-alkyne [3+2] cycloaddition.
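The pairing of ligation chemistries described above can be illustrated with a small lookup. This is a minimal sketch, assuming only the two chemistries named in the text (azide with alkyne giving a triazole, thiol with maleimide giving a thioether); the names `LIGATION_CHEMISTRY`, `linker_for`, and `is_orthogonal` are hypothetical, and side reactions between groups are not modeled.

```python
# Reactive-group pairs named in the text and the linker each pair forms
LIGATION_CHEMISTRY = {
    frozenset({"azide", "alkyne"}): "triazole",
    frozenset({"thiol", "maleimide"}): "thioether",
}


def linker_for(group_a, group_b):
    """Return the linker formed by two groups, or None if they do not pair."""
    return LIGATION_CHEMISTRY.get(frozenset({group_a, group_b}))


def is_orthogonal(pair1, pair2):
    """True if no group of pair1 pairs with any group of pair2 in this table."""
    return all(linker_for(a, b) is None for a in pair1 for b in pair2)


ligation1 = ("thiol", "maleimide")   # crgRNA to tracrgRNA
ligation2 = ("azide", "alkyne")      # tracrgRNA1 to tracrgRNA2
print(linker_for(*ligation1), linker_for(*ligation2))
print(is_orthogonal(ligation1, ligation2))
```

Under this simplified table, two ligations that both use azide-alkyne cycloaddition are reported as non-orthogonal, which is consistent with performing them sequentially rather than in one pot.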

An aspect of the invention is directed to CRISPR-Cas9-lgRNA system:

wherein lgRNA is a single ligated RNA composed of dual-modules, a triple or a multiple ligated RNA composed of dual-modules, and ligation sites are preferably located at tetraloop (A32-U37), and/or stem 2 (C70-G79) (of reported sgRNA designed for S. pyogenes Cas9), while any 3′, 5′-phosphodiester of sgRNA can be replaced by single nNt-Linker as a ligation site; and wherein Cas9 composed of a nuclease lobe (NUC) and a recognition lobe (REC) can be a CRISPR-associated protein other than the example of S. pyogenes Cas9 represented here, and can be any engineered Cas9.

Some aspects of the invention are directed to CRISPR-Cas9-lgRNA-transfecting reagent(s) systems.

Some aspects of the invention are directed to the use of CRISPR-Cas9-lgRNA system for antiviral therapy targeting against proviral DNAs or episomal circular DNAs.

In some embodiments, non-limiting examples of targeted viral genomic sequences of HBV include:

(SEQ ID NO 36) ctctgctagatcccagagtg [aGG],
(SEQ ID NO 37) gctatcgctggatgtgtctg [cGG],
(SEQ ID NO 38) tggacttctctcaattttct [a[G[G]G]G]],
(SEQ ID NO 39) gggggatcacccgtgtgtct [tGG],
(SEQ ID NO 40) tatgtggatgatgtggtactgg [gGG],
(SEQ ID NO 41) cctcaccatacagcactc [gGG],
(SEQ ID NO 42) gtgttggggtgagttgatgaatc [tGG],

wherein nucleotides in [ ] are PAMs, or any sequence of 17-20 nt upstream adjacent to a PAM sequence. More examples of such sequences can be found in literature (Lee, et al., WO2016/197132A1). A complete list of such sequences can be found in viral genomic sequences of HBV using online Basic Local Alignment Search Tool (BLAST). Useful selection methods to identify sequences having extremely low to no homology between the foreign viral genome and host cellular genome including endogenous retroviral DNA include bioinformatics screening to minimize and/or exclude off-target human transcriptome and essential untranslated genomic sites, and to optimize the efficacy of target cleavage.
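The selection rule stated above (a sequence of 17-20 nt immediately upstream of a PAM) can be sketched as a short scan. This is a minimal illustration, assuming the NGG PAM of S. pyogenes Cas9 as shown in the bracketed examples; the function name `find_protospacers` and the default length of 20 nt are illustrative choices, only the supplied strand is scanned, and no off-target screening is performed.

```python
def find_protospacers(genome, length=20):
    """Return (start, protospacer, pam) for every NGG PAM on the given strand.

    Assumption: the PAM is NGG and lies immediately 3' of the protospacer.
    Only the supplied strand is scanned.
    """
    seq = genome.lower()
    hits = []
    # i is the index of the first PAM base (the "N" of NGG)
    for i in range(length, len(seq) - 2):
        if seq[i + 1:i + 3] == "gg":
            hits.append((i - length, seq[i - length:i], seq[i:i + 3]))
    return hits


# SEQ ID NO 36 from the text followed by its PAM (aGG)
example = "ctctgctagatcccagagtg" + "agg"
for start, spacer, pam in find_protospacers(example):
    print(start, spacer, pam)
```

In practice the candidates returned by such a scan would still need the homology screening against the host genome that the text describes.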

In some embodiments, the therapeutic Cas9-lgRNAs comprise multiple lgRNAs targeting at different sites in viral genome (multiplex editing).

In some embodiments, multiple crgRNAs, including but not limited to sequences of YMDD and its mutations at catalytic domain of HBV polymerase, corresponding to

(SEQ ID NO 43) tatgtggatgat gtggtactgg [gGG],
(SEQ ID NO 44) tatatggatgat gtggtattgg [gGG],
(SEQ ID NO 45) tatgtggatgat gtggtattgg [gGG],
(SEQ ID NO 46) tatatagatgat gtggtactgg [gGG],

are ligated to tracrgRNA to result in a mixture of lgRNAs, and thus of Cas9-lgRNAs to target drug resistance in therapies based on direct-acting antiviral agents (DAA).
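The relationship among the four listed sequences can be made explicit by translating their first four codons and listing the positions at which they differ. This is a sketch only, assuming the reading frame begins at the first nucleotide of each listed sequence; the helper names `motif` and `mismatches` are hypothetical, and the codon table is limited to the codons that occur in these sequences.

```python
# Sequences copied from SEQ ID NO 43-46 (spaces removed, PAM omitted)
VARIANTS = {
    43: "tatgtggatgatgtggtactgg",
    44: "tatatggatgatgtggtattgg",
    45: "tatgtggatgatgtggtattgg",
    46: "tatatagatgatgtggtactgg",
}

# Only the codons that occur in the first 12 nt of these sequences
CODONS = {"tat": "Y", "atg": "M", "gtg": "V", "ata": "I", "gat": "D"}


def motif(seq):
    """Translate the first four codons (assumed reading frame: position 1)."""
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, 12, 3))


def mismatches(a, b):
    """1-based positions at which two equal-length sequences differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


for seq_id, seq in VARIANTS.items():
    print(seq_id, motif(seq), mismatches(VARIANTS[44], seq))
```

Because each variant differs from the others at only one or two positions, a mixture of crgRNAs, one per variant, covers the motif and its listed mutations.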

In some embodiments, non-limiting examples of targeted viral genomic sequences of HIV include:

(SEQ ID NO 47) gattggcaga actacacacc [aGG],
(SEQ ID NO 48) atcagatatc cactgacctt [tGG],
(SEQ ID NO 49) gcgtggcctg ggcgggactg [gGG],
(SEQ ID NO 50) cagcagttct tgaagtactc [cGG],

wherein nucleotides in [ ] are PAMs, or any sequence of 17-20 nt upstream adjacent to a PAM sequence. A complete list of such sequences can be found in viral genomic sequences of HIV using online Basic Local Alignment Search Tool (BLAST). Useful selection methods to identify sequences having extremely low to no homology between the foreign viral genome and host cellular genome including endogenous retroviral DNA include bioinformatics screening to minimize and/or exclude off-target human transcriptome and essential untranslated genomic sites, and to optimize the efficacy of target cleavage.

In some embodiments, non-limiting examples of targeted viral genomic sequences of herpesviridae virus such as HSV and EBV include:

(SEQ ID NO 51) gccctggaccaacccggccc [gGG],
(SEQ ID NO 52) ggccgctgccccgctccggg [tGG],
(SEQ ID NO 53) ggaagacaatgtgccgcca [tGG],
(SEQ ID NO 54) tctggaccagaaggctccgg [cGG],
(SEQ ID NO 55) gctgccgcggagggtgatga [cGG],
(SEQ ID NO 56) ggtggcccaccgggtccgct [gGG],
(SEQ ID NO 57) gtcctcgagggggccgtcgc [gGG],

wherein nucleotides in [ ] are PAMs, or any sequence of 17-20 nt upstream adjacent to a PAM sequence. A complete list of such sequences can be found in viral genomic sequences of herpesviridae virus using online Basic Local Alignment Search Tool (BLAST). Useful selection methods to identify sequences having extremely low to no homology between the foreign viral genome and host cellular genome including endogenous retroviral DNA include bioinformatics screening to minimize and/or exclude off-target human transcriptome and essential untranslated genomic sites, and to optimize the efficacy of target cleavage.

Other aspects of the invention are directed to the use of CRISPR-Cas9-lgRNA systems for gene editing and therapy.

Process for Preparations of Nucleotides

Other embodiments of this invention represent processes for preparation of these compounds provided herein, which can also be prepared by any other methods apparent to those skilled in the art.

Chemical Ligations

In some embodiments, the ligation between crgRNA and tracrgRNA (ligation1) is formation of triazole by Cu(I) catalyzed [2+3] cycloaddition. An azide or alkyne function is introduced at 3′-end of crgRNA which is an aminoalkyl oligonucleotide as represented by a non-limiting example of ON-11:

An azide or alkyne function is introduced at 5′-end nucleotide of tracrgRNA as represented by a non-limiting example of ON-12 by solid phase supported synthesis:

TracrgRNA and crgRNA are then ligated by click reaction, and annealed as represented by a non-limiting example of ON-13:

or are annealed, and then ligated by click chemistry.

In other embodiments, the ligation between crgRNA and tracrgRNA is formation of thioether linker or any compatible crosslinking chemistry.

In some embodiments for triple lgRNAs, the ligation in tracrgRNA (ligation2) and the ligation between crgRNA and tracrgRNA (ligation1) are orthogonal, and two ligations are realized in one pot:

A maleimide is introduced to 3′-end of crgRNA in solid phase supported synthesis as represented by a non-limiting example of ON-14:

A thiol and an azide are introduced at 5′-end and 3′-end of tracrgRNA1, respectively as represented by a non-limiting example of ON-15:

An alkyne is introduced at 5′-end of tracrgRNA2 as represented by a non-limiting example of ON-16:

TracrgRNA1, tracrgRNA2, and crgRNA are then ligated, and annealed as represented by a non-limiting example of ON-17:

In some embodiments, the ligation in tracrgRNA (ligation2) and the ligation between crgRNA and tracrgRNA (ligation1) are orthogonal, and two ligations are realized in two sequential steps.

In other embodiments, the ligation in tracrgRNA (ligation2) and the ligation between crgRNA and tracrgRNA (ligation1) are formed by the same ligation chemistry, and two ligations are realized sequentially as represented by synthesis of ON-20:

Formation of Cas9-lgRNA, Cellular Transfections, and Assays

The formation of Cas9-lgRNA complex, and cellular transfections are performed as reported.

a. Transfection with cationic lipids (Liu, D. et al. Nature Biotechnology 2015, 33, 73-80):

Purified synthetic lgRNA or mixture of synthetic lgRNAs is incubated with purified Cas9 protein for 5 min, and then complexed with the cationic lipid reagent in 25 μL OPTIMEM. The resulting mixture is applied to the cells for 4 h at 37° C.

b. Transfection with cell-penetrating peptides (Kim, H. et al. Genome Res. 2014, 24: 1012-1019):

Cell-penetrating peptide (CPP) is conjugated to a purified recombinant Cas9 protein (with appended Cys residue at the C terminus) by dropwise mixing of 1 mg Cas9 protein (2 mg/mL) with 50 μg 4-maleimidobutyryl-GGGRRRRRRRRRLLLL (m9R; 2 mg/mL) in PBS (pH 7.4) followed by incubation on a rotator at room temperature for 2 h. To remove unconjugated m9R, the samples are dialyzed against DPBS (pH 7.4) at 4° C. for 24 h using 50 kDa molecular weight cutoff membranes. Cas9-m9R protein is collected from the dialysis membrane and the protein concentration is determined using the Bradford assay (Biorad).

Synthetic lgRNA or a mixture of synthetic lgRNAs is complexed with CPP: lgRNA (1 μg) in 1 μl of deionized water is gently added to the C3G9R4LC peptide (9R) in lgRNA:peptide weight ratios that range from 1:2.5 to 1:40 in 100 μl of DPBS (pH 7.4). This mixture is incubated at room temperature for 30 min and diluted 10-fold using RNase-free deionized water.

150 μl Cas9-m9R (2 μM) protein is mixed with 100 μl lgRNA:9R (10:50 μg) complex and the resulting mixture is applied to the cells for 4 h at 37° C. Cells can also be treated with Cas9-m9R and lgRNA:9R sequentially.

The antiviral assay is performed according to reported procedures (Hu, W. et al. Proc Natl Acad Sci USA 2014, 110: 11461-11466; Lin, Su. et al. Molecular Therapy—Nucleic Acids, 2014, 3, e186). Delivery to cell lines relies on either cationic lipid-based or CPP-based delivery of Cas9-lgRNA complexes instead of plasmid transfection/transduction using gRNA/Cas9 expression vectors.

EXAMPLES

The following examples further illustrate embodiments of the disclosed invention, which are not limited by these examples.

Example 1: Compound 4

Compound 2 is prepared essentially according to a reported procedure in 4 steps (Santner, T. et al. Bioconjugate Chem. 2014, 25, 188-195). Compound 1 is treated with 2-azidoethanol in dimethylacetamide at 120° C. in the presence of BF₃·OEt₂, and the resulting 2′-azido nucleoside is tritylated (DMTrCl, in pyridine, RT), and attached to an amino-functionalized support.

To compound 2 (0.99 mmol) in THF/H₂O (2:1) (18 mL) is then added trimethylphosphine (1.5 mL, 1.5 mmol). The reaction is shaken at room temperature for 8 h and washed thoroughly with THF/H₂O (2:1). Dioxane/H₂O (1:1) (20 mL) and NaHCO₃ (185 mg, 2.2 mmol) are added. The reaction mixture is cooled down to 0° C., and Fmoc-OSu (415 mg, 1.23 mmol) in dioxane (2 mL) is added. The reaction is shaken for 15 min at 0° C., and then washed with water and then THF (3×20 mL) to give compound 4.
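The reagent quantities in this step can be cross-checked by converting the weighed masses to millimoles and to equivalents relative to compound 2. This is a minimal sketch; the molar masses are standard values that are not stated in the text, and the helper names `mmol_from_mg` and `equivalents` are hypothetical.

```python
# Molar masses in g/mol (standard values, not stated in the text)
MOLAR_MASS = {"NaHCO3": 84.01, "Fmoc-OSu": 337.33}

SUBSTRATE_MMOL = 0.99  # compound 2, from the text


def mmol_from_mg(mass_mg, reagent):
    """Convert a weighed mass in mg to mmol (mg divided by g/mol gives mmol)."""
    return mass_mg / MOLAR_MASS[reagent]


def equivalents(reagent_mmol, substrate_mmol=SUBSTRATE_MMOL):
    """Molar equivalents of a reagent relative to the substrate."""
    return reagent_mmol / substrate_mmol


for name, mg in [("NaHCO3", 185), ("Fmoc-OSu", 415)]:
    n = mmol_from_mg(mg, name)
    print(f"{name}: {n:.2f} mmol, {equivalents(n):.2f} eq")
```

The computed amounts agree with the 2.2 mmol and 1.23 mmol given in the procedure, which corresponds to a modest excess of each reagent over the support-bound substrate.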

Example 2: Compound 7

6-Benzoyl-5′-O-DMTr-adenosine 5 (1.24 g, 1.84 mmol) is dissolved in THF (40 mL) and sodium hydride (60% dispersion in mineral oil, 0.184 g, 4.6 mmol) is added in portions at 0° C. The reaction mixture is warmed up to room temperature and stirred for 15 min. The reaction is then cooled to 0° C., and propargyl bromide (80% in toluene, 0.44 mL, 3.98 mmol) is added. The reaction is then stirred under reflux for 12 h. Saturated aqueous sodium bicarbonate (10 mL) is added, and volatiles are removed in vacuo. The resulting residue is dissolved in DCM, and washed with water and saturated brine. The organic layer is collected, dried over anhydrous sodium sulfate, and concentrated to dryness in vacuo. The resulting residue is purified by silica-gel chromatography (eluent: 97:3, DCM:MeOH, 0.5% pyridine) to provide compound 6.
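Because sodium hydride is weighed as a 60% dispersion, the active amount is obtained by correcting the weighed mass for the weight fraction. This is a brief sketch of that correction; the molar mass of NaH is a standard value not stated in the text, and the function name `active_mmol` is hypothetical.

```python
def active_mmol(mass_g, weight_fraction, molar_mass):
    """mmol of active reagent in a diluted reagent of given weight fraction."""
    return mass_g * weight_fraction / molar_mass * 1000.0


NAH_MOLAR_MASS = 24.00  # g/mol, standard value (not stated in the text)
SUBSTRATE_MMOL = 1.84   # compound 5, from the text

nah_mmol = active_mmol(0.184, 0.60, NAH_MOLAR_MASS)
print(f"NaH: {nah_mmol:.2f} mmol, {nah_mmol / SUBSTRATE_MMOL:.2f} eq")
```

The result matches the 4.6 mmol stated in the procedure, that is, 2.5 equivalents of base relative to compound 5.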

To a solution of compound 6 (0.30 g, 0.42 mmol) in anhydrous dichloromethane (5 mL) and diisopropylethylamine (0.15 mL, 0.83 mmol), under nitrogen, is added 2-cyanoethyl-N,N-diisopropyl-chlorophosphoramidite (0.13 mL, 0.58 mmol) dropwise. The reaction is stirred at room temperature for 3 h, and then diluted with dichloromethane (25 mL). The resulting solution is washed with saturated aqueous KCl (25 mL), dried over anhydrous Na₂SO₄, and concentrated in vacuo. The residue is purified by column chromatography (60% EtOAc/hexane, 0.5% pyridine) to give compound 7.

Example 3: ON-11

ON-11 is prepared using 2′-TBS protected RNA phosphoramidite monomers with t-butylphenoxyacetyl protection of the A, G and C nucleobases and unprotected uracil. 0.3 M Benzylthiotetrazole in acetonitrile (Link Technologies) is used as the coupling agent, t-butylphenoxyacetic anhydride as the capping agent and 0.1 M iodine as the oxidizing agent. Oligonucleotide synthesis is carried out on an Applied Biosystems 394 automated DNA/RNA synthesizer using the standard 1.0 μmole RNA phosphoramidite cycle. Compound 4 (20 mg) is packed into a twist column. All β-cyanoethyl phosphoramidite monomers are dissolved in anhydrous acetonitrile to a concentration of 0.1 M immediately prior to use. Stepwise coupling efficiencies are determined by automated trityl cation conductivity monitoring and in all cases are >96.5%.
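The stated stepwise coupling efficiency bounds the fraction of chains that reach full length. The sketch below assumes the first nucleoside is already on the support, so that an n-mer requires n - 1 couplings; the oligonucleotide lengths used in the loop are hypothetical, since the length of ON-11 is not given here, and the function name `full_length_fraction` is illustrative.

```python
def full_length_fraction(n_nucleotides, stepwise_efficiency=0.965):
    """Fraction of chains reaching full length after n - 1 couplings."""
    if n_nucleotides < 1:
        raise ValueError("n_nucleotides must be at least 1")
    return stepwise_efficiency ** (n_nucleotides - 1)


# Hypothetical lengths chosen only to show the trend
for n in (20, 40, 60):
    print(n, round(full_length_fraction(n), 3))
```

Since 96.5% is a lower limit on the efficiency, the values obtained this way are lower limits on the full-length fraction; they fall quickly with length, which is one practical motivation for assembling long guide RNAs from shorter ligated modules.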

Fmoc is then cleaved by treatment with 20% piperidine in DMF. The resulting 3′-end aminoethyl oligonucleotide is then treated with NHS ester of 6-azido caproic acid in DMF.

Cleavage of oligonucleotides from the solid support and deprotection are achieved by exposure to concentrated aqueous ammonia/ethanol (3/1 v/v) for 2 h at room temperature followed by heating in a sealed tube for 45 min at 55° C. and desilylation in 1.0 M TBAF in THF for 24 h.

The resulting oligonucleotide is dissolved in 0.5 M Na₂CO₃/NaHCO₃ buffer (pH 8.75) and incubated with succinimidyl-6-azidohexanoate (20 eq.) in DMSO to give ON-11.

Purification of the oligonucleotide is carried out by reversed-phase HPLC on a Gilson system using an XBridge™ BEH300 Prep C18 10 μm 10×250 mm column (Waters) with a gradient of acetonitrile in ammonium acetate (0% to 50% buffer B over 30 min, flow rate 4 mL/min), buffer A: 0.1 M ammonium acetate, pH 7.0, buffer B: 0.1 M ammonium acetate, pH 7.0, with 50% acetonitrile. Elution is monitored by UV absorption at 295 nm. After HPLC purification, the oligonucleotide is desalted using an NAP-10 column.
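The gradient described above can be expressed as a simple function of time. This is a sketch assuming a linear ramp from 0% to 50% buffer B over 30 min; the function names are hypothetical, and the acetonitrile content follows from buffer B containing 50% acetonitrile while buffer A contains none.

```python
def percent_b(t_min, start=0.0, end=50.0, duration=30.0):
    """Percent buffer B at time t for a linear gradient, clamped to its ends."""
    t = min(max(t_min, 0.0), duration)
    return start + (end - start) * t / duration


def percent_acetonitrile(t_min, acn_in_b=50.0):
    """Acetonitrile (% v/v) in the eluent; buffer A contains none."""
    return percent_b(t_min) * acn_in_b / 100.0


def volume_delivered_ml(t_min, flow_ml_min=4.0):
    """Total eluent volume pumped by time t at constant flow."""
    return flow_ml_min * t_min


for t in (0, 15, 30):
    print(t, percent_b(t), percent_acetonitrile(t), volume_delivered_ml(t))
```

Under these assumptions the eluent reaches 25% acetonitrile at the end of the gradient, after 120 mL has passed through the column.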

Example 4: ON-12

ON-12 is synthesized in a way similar to the synthesis of ON-11, except that commercially available solid phase supported uridine phosphoramidite is used, and the last nucleoside phosphoramidite is compound 7.

The oligonucleotide is cleaved off the solid phase and fully deprotected, and purified as in example 3.

Example 5: ON-13

A solution of alkyne ON-12 and azide ON-11 (0.2 nmol of each) in 0.2 M NaCl (50 μL) is annealed for 30 min at room temperature. In the meantime tris-hydroxypropyl triazole ligand (28 nmol in 42 μL 0.2 M NaCl), sodium ascorbate (40 nmol in 4 μL 0.2 M NaCl) and CuSO₄·5H₂O (4 nmol in 4.0 μL 0.2 M NaCl), are added under argon. The reaction mixture is kept under argon at room temperature for the desired time, and formamide (50 μL) is added. The reaction is analyzed by loading directly onto a 20% polyacrylamide electrophoresis gel, and purified by reversed-phase HPLC.
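The final concentrations in the click reaction follow from the amounts and volumes given in this example. The sketch below assumes the volumes are additive and excludes the formamide added after the reaction; the names `AMOUNTS_NMOL`, `VOLUMES_UL`, and `final_um` are hypothetical.

```python
# Amounts (nmol) taken from Example 5
AMOUNTS_NMOL = {
    "ON-11": 0.2,
    "ON-12": 0.2,
    "ligand": 28.0,
    "ascorbate": 40.0,
    "CuSO4": 4.0,
}

# Volumes (uL): oligonucleotide solution, ligand, ascorbate, CuSO4
VOLUMES_UL = [50.0, 42.0, 4.0, 4.0]

total_ul = sum(VOLUMES_UL)


def final_um(nmol, volume_ul=total_ul):
    """Final concentration in micromolar (nmol per uL is mM, so multiply by 1000)."""
    return nmol / volume_ul * 1000.0


for name, nmol in AMOUNTS_NMOL.items():
    eq = nmol / AMOUNTS_NMOL["ON-11"]
    print(f"{name}: {final_um(nmol):.0f} uM, {eq:.0f} eq vs each oligonucleotide")
```

With a total volume of 100 μL, each oligonucleotide is present at 2 μM, while copper, ligand, and ascorbate are in 20-fold, 140-fold, and 200-fold molar excess, respectively.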

Example 6: ON-17

The ON-14 oligonucleotide carrying a maleimido group is incubated with 5′-SH-oligonucleotide (ON-15, 1:1 molar ratio) in 0.1 M triethylammonium acetate (TEAA) at pH 7.0 overnight at room temperature. The solution is evaporated to dryness, and to the resulting mixture is added ON-16 in 250 μM final concentration in phosphate-buffered saline (PBS), pH 7.4, containing 0.7% DMF and incubated at room temperature. The reaction mixture is analyzed and separated by HPLC to give ON-17.

Example 7: Cas9::ON-13 Complex

Cas9 protein: Recombinant Cas9 protein is available from New England BioLabs, Inc. and other providers or purified from E. coli by a routinely used protocol (Anders, C. and Jinek, M. Methods Enzymol. 2014, 546, 1-20). The purity and concentration of Cas9 protein are analyzed by SDS-PAGE.

Cas9-lgRNA Complex:

Cas9 and lgRNA are preincubated in a 1:1 molar ratio in the cleavage buffer to reconstitute the Cas9-lgRNA complex. 

What is claimed is:
 1. A cell treated with a composition comprising at least one lgRNA, and said lgRNA comprises: a. a synthetic crRNA, comprising a spacer and a second oligonucleotide, wherein i. said spacer is an oligonucleotide of greater than 12 bases that targets a DNA sequence, and said second oligonucleotide is an RNA segment of 8-25 nucleotides, ii. said spacer and said second oligonucleotide are joined between the 3′-end of said spacer and the 5′-end of said second oligonucleotide via an internucleotide phosphate diester, a thiophosphate diester

 a boranophosphate diester

 a phosphonoacetate

 a thiophosphonoacetate

 a phosphoramidate

 or a thiophosphoramidate

 and iii. said spacer and said second oligonucleotide comprise a single or a plurality of nucleosides or nucleotides modified at sugar moieties selected from the group consisting of:

wherein Q is a natural or modified nucleobase, b. a synthetic tracrRNA, comprising: i. a single or a plurality of oligonucleotides, ii. said oligonucleotides of b. i. are sequentially joined between the 3′-end and the 5′-end via an internucleotide phosphate diester, a thiophosphate diester

 a boranophosphate diester

 a phosphonoacetate

 a thiophosphonoacetate

 a phosphoramidate

 or a thiophosphoramidate

 or an nNt-Linker set forth as in c., and iii. said oligonucleotides of b. i. comprise a single or a plurality of nucleosides or nucleotides modified at sugar moieties, wherein said nucleosides and nucleotides are selected from the group consisting of II-1 to II-37, c. one or more nNt-Linkers, comprising: i. an optionally substituted M core structure of Formula M-1 to M-18:

 wherein X=O, S, NH, or CH₂, X₁═N or CH, X₂═N or CH, R_(M)═H, CH₃, alkyl, aryl, or heteroaryl, m=0 to 3 and n=0 to 3, ii. two L linkers, and each said L linker comprises absent or more structures selected from the group consisting of L-1 to L-24:

 where m=0 to 16 and n=0 to 16, and wherein iii. said L linkers and said M core structure are joined as L-M-L, wherein the two L linkers are the same or different, and attached to two terminal nucleotides of Formula Nuc-1 to Nuc-19:

 wherein the attached positions are

 to L-M-L and

 or

 to upstream and downstream oligonucleotides, respectively, and wherein R is H, OH,

 CH₂OH,

 F, NH₂, OMe, CH₂OMe, OCH₂CH₂OMe, an alkyl, a cycloalkyl, an aryl, or a heteroaryl, R′ is H, OH,

 CH₂OH,

 F, NH₂, OMe, CH₂OMe, OCH₂CH₂OMe, an alkyl, a cycloalkyl, an aryl, or a heteroaryl, and Q is a natural or a non-natural nucleic acid base, wherein an nNt-Linker of c. joins the 3′-terminal nucleotide of said synthetic crRNA of a. and the 5′-terminal nucleotide of said synthetic tracrRNA of b. and said at least one nNt-Linkers of c. are positioned outside the bound regions of said lgRNA by a CRISPR-associated protein.
 2. Said cell of claim 1 comprising at least one said lgRNA.
 3. Said cell of claim 1, wherein said composition further comprises a polypeptide selected from the group consisting of a Cas9, a Cas9 with reduced nuclease activity, a Cas9 with nickase activity, a Cas9 with no nuclease activity, and a fusion protein comprising a Cas9 domain, wherein the Cas9 domain is capable of binding with the activating duplex region and wherein the fusion protein further comprises a domain from a polypeptide other than Cas9.
 4. Said cell of claim 3, wherein said polypeptide is replaced with its encoding nucleic acid.
 5. Said cell of claim 1, wherein said composition further comprises a fusion protein comprising a Cas9 domain, wherein the Cas9 domain is capable of binding with the activating duplex region and wherein the fusion protein further comprises a domain from a polypeptide other than Cas9 and which confers an additional activity on the site-directed polypeptide selected from the group consisting of nuclease activity, methyltransferase activity, demethylase activity, DNA repair activity, DNA damage activity, deamination activity, dismutase activity, alkylation activity, depurination activity, oxidation activity, pyrimidine dimer forming activity, integrase activity, transposase activity, recombinase activity, polymerase activity, ligase activity, helicase activity, photolyase activity, glycosylase activity, acetyltransferase activity, deacetylase activity, kinase activity, phosphatase activity, ubiquitin ligase activity, deubiquitinating activity, adenylation activity, deadenylation activity, SUMOylating activity, deSUMOylating activity, ribosylation activity, deribosylation activity, myristoylation activity and demyristoylation activity.
 6. Said cell of claim 5, wherein said fusion protein is replaced with its encoding nucleic acid.
 7. Said cell of claim 1, wherein said composition further comprises one or more DNA repair templates selected from the group consisting of ssDNA and dsDNA.
 8. Said DNA repair template of claim 7 comprising an oligonucleotide sequence to introduce insertion(s) of one or more stop codons selected from the group consisting of 5′-(tga)-3′, 5′-(taa)-3′, 5′-(tag)-3′, 5′-(tga-ntga-ntga)-3′, 5′-(tga-ntga-ntaa)-3′, 5′-(tga-ntga-ntag)-3′, 5′-(tga-ntaa-ntga)-3′, 5′-(tga-ntaa-ntaa)-3′, 5′-(tga-ntaa-ntag)-3′, 5′-(tga-ntga-ntga)-3′, 5′-(tga-ntga-ntaa)-3′, 5′-(tga-ntga-ntag)-3′, 5′-(taa-ntga-ntga)-3′, 5′-(taa-ntga-ntaa)-3′, 5′-(taa-ntga-ntag)-3′, 5′-(taa-ntaa-ntga)-3′, 5′-(taa-ntaa-ntaa)-3′, 5′-(taa-ntaa-ntag)-3′, 5′-(taa-ntga-ntga)-3′, 5′-(taa-ntga-ntaa)-3′, 5′-(taa-ntga-ntag)-3′, 5′-(tag-ntga-ntga)-3′, 5′-(tag-ntga-ntaa)-3′, 5′-(tag-ntga-ntag)-3′, 5′-(tag-ntaa-ntga)-3′, 5′-(tag-ntaa-ntaa)-3′, 5′-(tag-ntaa-ntag)-3′, 5′-(tag-ntga-ntga)-3′, 5′-(tag-ntga-ntaa)-3′, 5′-(tag-ntga-ntag)-3′, wherein n is any nucleotide, and said more stop codons comprises repetitive said sequence separated by absent or more nucleotides in between or different said sequences separated by absent or more nucleotides in between.
 9. Said DNA repair template of claim 7 comprising an oligonucleotide sequence to introduce insertion(s) of one or more transcription cis-regulatory elements, and said more elements comprises repetitive sequence separated by absent or more nucleotides in between or different sequences separated by absent or more nucleotides in between.
 10. Said cell of claim 7, wherein said DNA repair template is covalently linked to said lgRNA.
 11. Said cell of claim 1, wherein said lgRNA is covalently linked with one or more molecules selected from the group consisting of fluorescent molecules, PEGs, non-PEG polymers, ligands of cellular receptors, lipids, oligonucleotides, polysaccharides, glycans, peptides, aptamers and antibodies to form an lgRNA conjugate, and the said more molecules can be the same or different.
 12. Said cell of claim 1, wherein said composition further comprises a carrier containing a DNA repair template, and said carrier is selected from the group consisting of an AAV vector, a plasmid and a retron.
 13. Said cell of claim 1, wherein the cell is selected from the group consisting of a bacterial cell, an archaeal cell, a plant cell, an algal cell, a fungal cell, an invertebrate cell, a vertebrate cell, a mammalian cell, and a human cell.
 14. Said cell of claim 1, wherein the cell comprises deactivated integrated viral DNAs or deactivated pathogenic genes.
 15. Said cell of claim 1, wherein the cell comprises deactivated integrated HBV DNAs.
 16. Said cell of claim 1, wherein the cell comprises deactivated integrated HIV DNAs.
 17. Said cell of claim 1, wherein the cell comprises a deactivated CCR5 gene.
 18. Said cell of claim 1, wherein the cell comprises deactivated integrated HSV DNAs.
 19. Said cell of claim 1, wherein the cell comprises an edited host gene.
 20. Said cell of claim 1, wherein the cell comprises one or more inserted genes.
 21. Said cell of claim 1, wherein the cell further comprises other therapeutic agents.
 22. Said cell of claim 2, wherein said at least one lgRNA is introduced into the cell by lipofection, electroporation, nucleofection, microinjection, biolistics, in liposomes, immunoliposomes, with polycations, as nucleic acid conjugates, or combinations thereof.
 23. A cell comprising at least one said ligated tracrRNA of claim 1.
 24. Said cell of claim 23, wherein said tracrRNA is covalently linked to one or more molecules selected from the group consisting of fluorescent molecules, PEGs, non-PEG polymers, ligands of cellular receptors, lipids, oligonucleotides, polysaccharides, glycans, peptides, aptamers and/or antibodies to form a tracrRNA conjugate, and the said more molecules can be the same or different.